Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🌱 [WIP] Use SPDY over websockets and fallback to direct SPDY on client side #11125

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

chrischdi
Copy link
Member

What this PR does / why we need it:

In Kubernetes they switched over to using websockets per default instead of SPDY and only fallback to spdy:

/area provider/control-plane-kubeadm

Note: The inmemory provider currently only supports SPDY on server-side.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

@k8s-ci-robot k8s-ci-robot added the area/provider/control-plane-kubeadm Issues or PRs related to KCP label Sep 2, 2024
@chrischdi
Copy link
Member Author

/test help

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 2, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from chrischdi. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Copy link
Contributor

@chrischdi: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test pull-cluster-api-build-main
  • /test pull-cluster-api-e2e-blocking-main
  • /test pull-cluster-api-e2e-conformance-ci-latest-main
  • /test pull-cluster-api-e2e-conformance-main
  • /test pull-cluster-api-e2e-latestk8s-main
  • /test pull-cluster-api-e2e-main
  • /test pull-cluster-api-e2e-mink8s-main
  • /test pull-cluster-api-e2e-upgrade-1-31-1-32-main
  • /test pull-cluster-api-test-main
  • /test pull-cluster-api-test-mink8s-main
  • /test pull-cluster-api-verify-main

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-apidiff-main

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-apidiff-main
  • pull-cluster-api-build-main
  • pull-cluster-api-e2e-blocking-main
  • pull-cluster-api-test-main
  • pull-cluster-api-verify-main

In response to this:

/test help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Sep 2, 2024
@chrischdi
Copy link
Member Author

/test pull-cluster-api-e2e-conformance-main
/test pull-cluster-api-e2e-latestk8s-main
/test pull-cluster-api-e2e-main
/test pull-cluster-api-e2e-mink8s-main

@chrischdi
Copy link
Member Author

/retest

flake


websocketDialer, err := portforward.NewSPDYOverWebsocketDialer(req.URL(), d.restConfig)
if err != nil {
return nil, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe let's wrap the error

@@ -26,7 +26,9 @@ import (
"github.com/pkg/errors"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Could it be that PR title and description are slightly incorrect?

It seems to me that we are always using SPDY. The question is just if it's over webhooks or directly?

dialer := spdy.NewDialer(d.upgrader, &http.Client{Transport: d.proxyTransport}, "POST", req.URL())
spdyDialer := spdy.NewDialer(d.upgrader, &http.Client{Transport: d.proxyTransport}, "POST", req.URL())

websocketDialer, err := portforward.NewSPDYOverWebsocketDialer(req.URL(), d.restConfig)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know if this only works with certain Kubernetes apiserver versions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without diving to deep into the upstream k/k PR it looks to me like there is a server-side feature-gate to make this possible (PortForwardWebsockets) and it's only enabled per default with 1.31?
Is this correct?

xref:

If yes, this means for Kubernetes < 1.31 we are now with every single reconcile trying the websocketDialer but it won't work. Sounds like a lot of unnecessary "work" (not sure what it's exactly doing, requests?). We should probably try to avoid this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
Wondering if we should have two dials method and invoke them depending on the k8s version, or we should have a cache (might be a TTL cache) where we track IP that do not support web sockets

Copy link
Member Author

@chrischdi chrischdi Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We maybe should configure the fallback depending on the k8s version.

< 1.31 : spdy first, websocket second
>= 1.31 : websocket first, spdy second

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be the simple option, it just doesn't take care of the case where someone disables the feature gate in 1.31.

Do we have an idea of how high the impact is of falling back on every single reconcile?

Copy link
Member

@sbueringer sbueringer Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah wait. We're talking about KCP, so KCP would actually know if someone enables/disables the feature gate (and not just the default is used). Although this could be tricky during rollouts, but probably fine if we just temporarily fallback a few additional times (should be okay as long as we don't permanently try to do something that has no chance of working?)

}

// First attempt tunneling (websocket) dialer, then fallback to spdy dialer.
dialer := portforward.NewFallbackDialer(websocketDialer, spdyDialer, func(err error) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we can't tell from green e2e tests that websocketDialer worked, right? (it could be that we are always falling back to the spdyDialer, which I assume would lead to a lot of unnecessary stuff being done)

Would it make sense to check it once with Tilt manually? (if not already done)

@chrischdi
Copy link
Member Author

Moving back to WIP
/retitle 🌱 [WIP] Use websockets and fallback to SPDY on client side

@k8s-ci-robot k8s-ci-robot changed the title 🌱 Use websockets and fallback to SPDY on client side 🌱 [WIP] Use websockets and fallback to SPDY on client side Sep 3, 2024
Copy link
Member

@fabriziopandini fabriziopandini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! this was also in my todo list, but unfortunately with low priority 😅
(And it is still there for the server implementation in the inmemory provider)

@sbueringer
Copy link
Member

Yeah we noticed through 1.31 relase notes that the server-side feature is now enabled per default.

@chrischdi chrischdi changed the title 🌱 [WIP] Use websockets and fallback to SPDY on client side 🌱 [WIP] Use SPDY over websockets and fallback to direct SPDY on client side Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/provider/control-plane-kubeadm Issues or PRs related to KCP cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants